Information-Agnostic Flow Scheduling for Commodity Data Centers
Abstract
Many existing data center network (DCN) flow scheduling schemes minimize flow completion time (FCT) by relying on prior knowledge of flows and custom switch functions, which makes them superior in performance but hard to deploy in practice. By contrast, we seek to minimize FCT with no prior knowledge and with existing commodity switch hardware. To this end, we present PIAS (Practical Information-Agnostic flow Scheduling), a DCN flow scheduling mechanism that aims to minimize FCT by mimicking Shortest Job First (SJF) on the premise that flow size is not known a priori. At its heart, PIAS leverages the multiple priority queues available in existing commodity switches to implement a Multiple Level Feedback Queue (MLFQ), in which a PIAS flow is gradually demoted from higher-priority queues to lower-priority queues based on the number of bytes it has sent. As a result, short flows are likely to finish in the first few high-priority queues and are thus generally prioritized over long flows, which enables PIAS to emulate SJF without knowing flow sizes beforehand. We have implemented a PIAS prototype and evaluated it through both testbed experiments and ns2 simulations. We show that PIAS is readily deployable with commodity switches and backward compatible with legacy TCP/IP stacks. Our evaluation results show that PIAS significantly outperforms existing information-agnostic schemes. For example, it reduces FCT by up to 50% and 40% over DCTCP [11] and L2DCT [27], respectively, and it has only a 4.9% performance gap to an ideal information-aware scheme, pFabric [13], for short flows under a production DCN workload. PIAS was first introduced in an earlier workshop paper [14], which sketched a preliminary design and initial results.
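The byte-count-based demotion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `priority_for` and `FlowTagger`, the threshold values, and the choice to tag each packet according to the bytes sent before it are all assumptions made for illustration. The sketch only shows how a flow's priority could fall as its sent-byte count crosses successive thresholds, without any knowledge of the flow's total size.

```python
from bisect import bisect_right

# Hypothetical demotion thresholds in bytes (illustrative values, not from the paper).
# With K thresholds there are K + 1 priority queues; queue 0 is the highest priority.
THRESHOLDS = [100_000, 1_000_000, 10_000_000]


def priority_for(bytes_sent, thresholds=THRESHOLDS):
    """Return the queue index for a flow that has already sent `bytes_sent` bytes."""
    # bisect_right counts how many thresholds are <= bytes_sent,
    # which is exactly the number of demotions the flow has undergone.
    return bisect_right(thresholds, bytes_sent)


class FlowTagger:
    """Tracks bytes sent per flow and tags each outgoing packet with a priority."""

    def __init__(self, thresholds=THRESHOLDS):
        self.thresholds = thresholds
        self.bytes_sent = {}  # flow_id -> bytes sent so far

    def tag(self, flow_id, packet_len):
        """Return the priority for this packet, then account for its bytes."""
        sent = self.bytes_sent.get(flow_id, 0)
        prio = priority_for(sent, self.thresholds)
        self.bytes_sent[flow_id] = sent + packet_len
        return prio


if __name__ == "__main__":
    tagger = FlowTagger(thresholds=[3000, 6000])
    # A single flow sending six 1500-byte packets is demoted step by step.
    print([tagger.tag("flow-a", 1500) for _ in range(6)])
```

Under these assumptions, a flow that finishes before reaching the first threshold never leaves the highest-priority queue, while a long flow drifts toward the lowest one, which is the sense in which such a scheme approximates SJF.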
Similar resources
Flow Scheduling Strategies for Minimizing Flow Completion Times in Information-agnostic Data Center Networks
Minimizing the flow completion time (FCT) is widely considered as an important optimization goal in designing data center networks. However, existing schemes either rely on the precondition that the size and deadline of each flow is known in advance, or require modifying the switch hardware, which is hard to implement in practice. In this paper, we present MCPF, a flexible and dynamic flow sche...
Phurti: Application and Network-aware Flow Scheduling for Mapreduce
Traffic for a typical MapReduce job in a datacenter consists of multiple network flows. Traditionally, network resources have been allocated to optimize network-level metrics such as flow completion time or throughput. Some recent schemes propose using application-aware scheduling which can reduce the average job completion time. However, most of them treat the core network as a black box with ...
Multi-commodity flow and station logistics resolution for train unit scheduling
Proceedings Paper: Kwan, RSK, Lin, Z, Copado-Mendez, PJ et al. (1 more author) (2017) Multi-commodity flow and station logistics resolution for train unit scheduling. In: Gunawan, A, Kendall, G, Soon, LL, McCollum, B and Seow, H-V, (eds.) Proceedings of the 8th Multidisciplinary International Conference on Scheduling: Theory and Applications. Multidisciplinary International Scheduling Conferenc...
Hedera: Dynamic Flow Scheduling for Data Center Networks
Today’s data centers offer tremendous aggregate bandwidth to clusters of tens of thousands of machines. However, because of limited port densities in even the highest-end switches, data center topologies typically consist of multi-rooted trees with many equal-cost paths between any given pair of hosts. Existing IP multipathing protocols usually rely on per-flow static hashing and can cause subs...
Making Scheduling "Cool": Temperature-Aware Workload Placement in Data Centers
Trends towards consolidation and higher-density computing configurations make the problem of heat management one of the critical challenges in emerging data centers. Conventional approaches to addressing this problem have focused at the facilities level to develop new cooling technologies or optimize the delivery of cooling. In contrast to these approaches, our paper explores an alternate dimen...